Easy2Siksha.com
GNDU QUESTION PAPERS 2022
BA/BSc 6th SEMESTER
QUANTITATIVE TECHNIQUES – VI
Time Allowed: 3 Hours Maximum Marks: 100
Note: Aempt Five quesons in all, selecng at least One queson from each secon.
The Fih queson may be aempted from any secon.
All quesons carry equal marks.
SECTION – A
I. Discuss the nature and scope of Econometrics.
How is Econometrics dierent from Mathemacs and Stascs?
II. What is Simple Linear Regression Model?
From the data given below, esmate two-variable Regression Model by OLS method:
X: 6  8  12  14  18  20
Y: 8  10  6  8  10  12
SECTION – B
III. What is General Linear Regression Model?
Give its assumpons and properes.
IV. Dierenate between R² and Adjusted R².
Use the following data:
X: 65  57  57  54  66
Y: 26  13  16  −7  27
Esmate the regression line:

Esmate R² and Adjusted R², and interpret the results.
Also test the hypothesis that β = 0 against the alternave hypothesis β ≠ 0 at 5% level of
signicance.
SECTION – C
V. Discuss the problem of Heteroscedascity in regression analysis.
What are the tests to detect this problem and explain the remedial measures.
VI. What are the consequences and tests of the Multicollinearity problem in regression
analysis?
SECTION – D
VII. Dierenate between Distributed Lag Models and Auto-Regressive Models.
Discuss the problems of esmaon of Koyck’s Distributed Lag Model.
VIII. What are dummy variables?
Explain the uses of dummy variables.
GNDU ANSWER PAPERS 2022
BA/BSc 6th SEMESTER
QUANTITATIVE TECHNIQUES – VI
Time Allowed: 3 Hours Maximum Marks: 100
Note: Aempt Five quesons in all, selecng at least One queson from each secon.
The Fih queson may be aempted from any secon.
All quesons carry equal marks.
SECTION – A
I. Discuss the nature and scope of Econometrics.
How is Econometrics dierent from Mathemacs and Stascs?
Ans: Imagine you are trying to understand why some countries grow faster than others, why
the price of vegetables suddenly rises, or how education affects a person’s income.
Economics gives you theories and ideas about these questions, but how do you know if
those ideas are actually true in the real world? This is where Econometrics comes in.
Econometrics is one of the most practical and powerful branches of economics because it
connects theory with reality. It uses mathematical models and statistical tools to test
economic ideas using real-world data. In simple terms, Econometrics is the science of
measuring economic relationships.
Let us understand the nature and scope of Econometrics, and then see how it is different
from Mathematics and Statistics.
Nature of Econometrics
The word Econometrics is made up of two parts: “Econo” (Economics) and “Metrics”
(measurement). Therefore, Econometrics literally means measuring economic phenomena.
1. Econometrics is a Blend of Three Subjects
Econometrics is not a completely separate subject; rather, it combines three important
fields:
Economics: provides theories (for example, higher demand leads to higher prices).
Mathematics: helps express these theories in the form of equations.
Statistics: allows us to analyze data and test whether the theory is correct.
For example, economists may believe that “as income increases, consumption also
increases.” Econometrics converts this belief into an equation and then checks it using real
data from households.
2. It is Both Theoretical and Practical
Econometrics is unique because it is not just about abstract thinking. It focuses on real-life
problems such as:
What factors cause unemployment?
How does inflation affect purchasing power?
Does increasing the minimum wage reduce poverty?
By studying actual data, econometrics helps governments, businesses, and researchers
make informed decisions.
3. Focus on Measurement
Unlike traditional economics, which often relies on logical reasoning, econometrics
emphasizes numerical measurement.
For instance, instead of simply saying “advertising increases sales,” econometrics answers
questions like:
By how much will sales increase if advertising spending rises by 10%?
This makes economic predictions more reliable.
4. Deals with Uncertainty
Human behavior is unpredictable. Two people earning the same salary may spend very
differently. Econometrics recognizes this uncertainty and uses probability to handle it.
That is why econometric results are rarely 100% certain, but they are highly useful for
understanding trends.
5. Policy-Oriented in Nature
Econometrics plays a major role in policy-making.
Governments use econometric models to:
Forecast economic growth
Predict tax revenue
Control inflation
Plan employment programs
Without econometrics, economic policies would be based mostly on guesswork.
Scope of Econometrics
The scope refers to the areas where econometrics is applied. Over time, its scope has
expanded greatly.
1. Testing Economic Theories
Econometrics helps verify whether economic theories actually work in the real world.
For example:
Does lowering interest rates really encourage investment?
Does education improve productivity?
By analyzing data, econometrics either supports or challenges these theories.
2. Forecasting Future Trends
One of the biggest strengths of econometrics is prediction.
It can forecast:
GDP growth
Inflation rates
Stock market trends
Demand for goods
Businesses rely heavily on these forecasts to plan production and investments.
3. Business Decision-Making
Companies use econometric techniques to answer questions such as:
How much should we produce?
What price should we charge?
Which marketing strategy will work best?
For example, an automobile company may analyze past sales data to predict demand for a
new car model.
4. Public Policy and Government Planning
Econometrics helps governments design effective policies.
Examples include:
Evaluating poverty reduction programs
Studying the impact of subsidies
Planning infrastructure projects
It ensures that public money is spent wisely.
5. Financial Market Analysis
Banks and financial institutions use econometrics to:
Assess risk
Predict interest rates
Evaluate investment opportunities
This reduces uncertainty in financial decisions.
6. Academic Research
Researchers use econometrics to explore complex issues like income inequality, climate
change impacts, and globalization.
Because of its wide applications, econometrics has become an essential tool in modern
economic research.
Difference Between Econometrics, Mathematics, and Statistics
Students often confuse these subjects because they use numbers and formulas. However,
their purposes are quite different.
Econometrics vs Mathematics
Mathematics is a pure science concerned with numbers, structures, and logical
relationships. It deals mostly with certainty.
For example:
2 + 2 = 4 (always true)
Econometrics, however, deals with real-world situations where certainty is rare.
Key Differences:
Mathematics focuses on abstract concepts; econometrics focuses on real economic
problems.
Mathematics provides tools; econometrics applies those tools.
Mathematical results are exact; econometric results are approximate.
Think of mathematics as a toolbox, and econometrics as the craftsman using those tools
to build something useful.
Econometrics vs Statistics
Statistics is the science of collecting, organizing, analyzing, and interpreting data. It is used in
many fields such as medicine, psychology, sports, and business.
Econometrics is more specialized.
Key Differences:
Statistics studies data in general; econometrics studies economic data specifically.
Statistics may describe patterns; econometrics explains relationships (cause and
effect).
Econometrics always begins with an economic theory, while statistics may not.
For example, statistics might show that ice cream sales and temperature are related.
Econometrics would go further and estimate how much sales increase when temperature
rises.
Conclusion
Econometrics is a powerful discipline that brings economics closer to reality. While
economic theories help us understand how the world should work, econometrics shows us
how the world actually works by testing those theories with data.
Its nature is practical, measurement-based, and policy-oriented. Its scope stretches across
business, government, finance, and research, making it one of the most valuable tools in
modern economics.
Although it uses mathematics and statistics, econometrics is different because it applies
these tools specifically to economic problems. Mathematics gives it structure, statistics gives
it methods, and economics gives it purpose.
II. What is Simple Linear Regression Model?
From the data given below, esmate two-variable Regression Model by OLS method:
X: 6  8  12  14  18  20
Y: 8  10  6  8  10  12
Ans: What is a Simple Linear Regression Model?
A Simple Linear Regression Model is a statistical tool used to study the relationship
between two variables:
Independent variable (X): The predictor or input.
Dependent variable (Y): The outcome or response.
The model assumes that the relationship between X and Y can be expressed as a straight
line:
Y = a + bX + e
Where:
a = intercept (value of Y when X = 0)
b = slope (change in Y for a one-unit change in X)
e = error term (captures variation not explained by X)
In simple words: regression tries to draw the “best fit line” through the data points so that
we can predict Y from X.
The Data Provided
We are given:
X: 6  8  12  14  18  20
Y: 8  10  6  8  10  12
Here, X is the independent variable, and Y is the dependent variable.
Steps to Estimate Regression by OLS (Ordinary Least Squares)
OLS is a method that finds the line which minimizes the sum of squared differences between
actual Y values and predicted Y values.
Step 1: Calculate Means of X and Y
X̄ = ΣX / n = (6 + 8 + 12 + 14 + 18 + 20) / 6 = 78 / 6 = 13
Ȳ = ΣY / n = (8 + 10 + 6 + 8 + 10 + 12) / 6 = 54 / 6 = 9
So, mean of X = 13, mean of Y = 9.
Step 2: Calculate Slope (b)
Formula:
b = Σ(X − X̄)(Y − Ȳ) / Σ(X − X̄)²
Let’s compute step by step:
X     Y     (X − X̄)   (Y − Ȳ)   Product   Square
6     8     −7         −1         7         49
8     10    −5         1          −5        25
12    6     −1         −3         3         1
14    8     1          −1         −1        1
18    10    5          1          5         25
20    12    7          3          21        49
Now, sum them up:
Σ(X − X̄)(Y − Ȳ) = 7 − 5 + 3 − 1 + 5 + 21 = 30
Σ(X − X̄)² = 49 + 25 + 1 + 1 + 25 + 49 = 150
So,
b = 30 / 150 = 0.2
Step 3: Calculate Intercept (a)
Formula:
a = Ȳ − bX̄ = 9 − (0.2 × 13) = 9 − 2.6 = 6.4
Step 4: Regression Equation
So, the estimated regression model is:
Ŷ = 6.4 + 0.2X
Interpretation of the Model
Intercept (6.4): When X = 0, the predicted value of Y is 6.4.
Slope (0.2): For every 1-unit increase in X, Y increases by 0.2 units on average.
This means the relationship between X and Y is positive but weak: Y grows slowly as X
increases.
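The arithmetic above can be checked with a few lines of Python. This is a minimal stdlib-only sketch; the function name ols_fit is ours, not from any library.

```python
# A minimal sketch of the two-variable OLS computation, using only the
# Python standard library. Variable names are illustrative.

def ols_fit(x_vals, y_vals):
    """Return (intercept, slope) for a two-variable OLS regression."""
    n = len(x_vals)
    mean_x = sum(x_vals) / n
    mean_y = sum(y_vals) / n
    # Slope = sum of cross-deviations / sum of squared X-deviations
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(x_vals, y_vals))
    sxx = sum((x - mean_x) ** 2 for x in x_vals)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

a, b = ols_fit([6, 8, 12, 14, 18, 20], [8, 10, 6, 8, 10, 12])
print(a, b)  # 6.4 and 0.2, matching the worked example
```

Running this reproduces the intercept 6.4 and slope 0.2 found by hand above.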
Making It Relatable
Imagine X as the number of hours studied and Y as the marks scored. The regression line
tells us:
Even if someone doesn’t study (X=0), they might still score around 6.4 marks
(intercept).
For each extra hour of study, marks increase by 0.2 on average (slope).
Of course, actual marks may vary (that’s the error term), but the regression line gives us a
general trend.
Why Regression Matters
Prediction: Helps forecast outcomes (e.g., sales based on advertising).
Understanding Relationships: Shows how one variable influences another.
Decision Making: Guides policies, business strategies, and scientific research.
Conclusion
A Simple Linear Regression Model is a way to describe the relationship between two
variables using a straight line. Using the OLS method, we estimated the regression equation
for the given data as:
Ŷ = 6.4 + 0.2X
This equation shows that Y increases slightly as X increases. Regression is not just math—it’s
a powerful tool to understand patterns in real life, from exam scores to business profits.
SECTION – B
III. What is General Linear Regression Model?
Give its assumpons and properes.
Ans: Imagine you are trying to understand why some students score higher marks than
others. You might think that the number of hours they study, their attendance, sleep habits,
and even their interest in the subject all play a role. But how can we measure exactly how
much each factor influences their performance? This is where the General Linear
Regression Model (GLRM) becomes extremely useful.
Let us explore this concept in a simple, story-like manner so that it feels less like
mathematics and more like logical thinking.
What is the General Linear Regression Model?
The General Linear Regression Model is a statistical method used to study the relationship
between one dependent variable (the outcome we want to predict) and two or more
independent variables (the factors that influence the outcome).
In simple words, it helps answer questions like:
How does study time affect exam marks?
Does income depend on education level and work experience?
How do rainfall and fertilizer affect crop production?
Instead of guessing, regression gives us a mathematical equation to predict outcomes.
The Basic Idea
The model is usually written as:
Y = β₀ + β₁X₁ + β₂X₂ + … + βₙXₙ + ε
Now don’t worry — let’s decode this in everyday language.
Y → The dependent variable (what we want to predict). Example: exam marks.
X₁, X₂, X₃… → Independent variables (factors affecting marks). Example: study hours,
attendance.
β₀ (beta zero) → The intercept. It is the starting value when all factors are zero.
β₁, β₂… → Coefficients. They tell us how much Y changes when X changes.
ε (epsilon) → Error term. It represents unpredictable factors like mood, health, or
luck.
So, the regression model is basically saying:
“Your result = predictable factors + unpredictable factors.”
Isn’t that exactly how real life works?
Why is it called “General”?
You might have heard of simple linear regression, which uses only one independent
variable.
Example:
Marks = Study Hours + Error
But life is rarely that simple. Most outcomes depend on multiple factors.
The General Linear Regression Model expands this idea by allowing many variables at once.
That is why it is called “general” — it is flexible and widely applicable.
It is used in:
Economics (predicting demand)
Business (forecasting sales)
Medicine (studying treatment effects)
Psychology (analyzing behavior)
Education (predicting performance)
In short, it is one of the most powerful tools in statistics.
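As a rough illustration of the model Y = β₀ + β₁X₁ + β₂X₂ + ε, here is a stdlib-only Python sketch that fits two predictors by solving the normal equations (XᵀX)b = XᵀY. The data points and true coefficients (2, 3, 0.5) are invented for demonstration; with no error term, OLS recovers them exactly.

```python
# Fit Y = b0 + b1*X1 + b2*X2 by solving the normal equations directly.

def solve(A, b):
    """Gaussian elimination with partial pivoting for a small linear system."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

# Design matrix rows are [1, X1, X2]; Y is generated as 2 + 3*X1 + 0.5*X2
X = [[1, x1, x2] for x1, x2 in [(1, 2), (2, 1), (3, 5), (4, 3), (5, 8)]]
Y = [2 + 3 * x1 + 0.5 * x2 for _, x1, x2 in X]

XtX = [[sum(r[i] * r[j] for r in X) for j in range(3)] for i in range(3)]
XtY = [sum(r[i] * y for r, y in zip(X, Y)) for i in range(3)]
beta = solve(XtX, XtY)
print(beta)  # approximately [2.0, 3.0, 0.5]
```

In practice one would use a statistical package rather than hand-rolled elimination, but the sketch shows that the “general” model is just the two-variable case extended to several regressors.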
Assumptions of the General Linear Regression Model
For the model to give accurate results, certain conditions must be satisfied. These are called
assumptions.
Think of assumptions like the rules of a game: if players follow them, the game remains
fair.
1. Linearity
The relationship between dependent and independent variables should be linear.
This does NOT mean the data must form a perfect straight line, but the overall trend
should be straight rather than curved.
Example:
If study hours increase, marks generally increase.
If the relationship is curved (like stress vs productivity), linear regression may not work well.
2. Independence of Errors
The errors (ε) should not influence each other.
In simple terms:
One student’s performance should not affect another’s error.
This assumption is especially important in time-based data. For example, today’s stock price
error should not depend on yesterday’s error.
3. Homoscedasticity (Constant Variance)
A big word but a simple idea.
It means the spread of errors should remain roughly the same across all values of X.
Imagine throwing darts:
If all darts land evenly around the target → good.
If some are tightly clustered and others are widely scattered → bad.
Unequal spread can make predictions unreliable.
4. Normality of Errors
The error terms should follow a normal distribution (a bell-shaped curve).
Why is this important?
Because many statistical tests depend on normality to produce valid conclusions.
Luckily, in large datasets, this assumption often holds naturally.
5. No Perfect Multicollinearity
Another complex term made easy!
It means independent variables should not be perfectly related to each other.
For example:
Including both age and year of birth is redundant.
Including study hours per day and study hours per week may create overlap.
When variables duplicate information, the model becomes confused about which factor is
actually responsible.
Properties of the General Linear Regression Model
Now let us understand what makes this model special.
1. Best Linear Unbiased Estimator (BLUE)
According to the famous Gauss–Markov theorem, the regression estimates are:
Best → minimum possible error
Linear → based on linear equations
Unbiased → on average, they hit the true value
Think of it like a highly accurate archer who consistently shoots near the bullseye.
2. Simplicity and Interpretability
One of the biggest strengths of linear regression is how easy it is to understand.
If β₁ = 5, it means:
Increasing X₁ by one unit increases Y by 5 units.
Even non-experts can interpret this.
3. Predictive Power
Regression is widely used for forecasting.
Examples:
Predicting future sales
Estimating population growth
Forecasting weather trends
Businesses rely heavily on regression to make strategic decisions.
4. Flexibility
The general model can handle:
Multiple variables
Dummy variables (like gender: male/female)
Interaction effects
Polynomial extensions
It adapts to many real-world problems.
5. Foundation for Advanced Models
Many complex statistical and machine-learning models are built on the idea of linear
regression.
So once you understand GLRM, you have already taken a big step into advanced analytics.
Conclusion
The General Linear Regression Model is much more than a formula; it is a structured way
of understanding relationships in the world around us.
It teaches us that outcomes are rarely random; they are influenced by identifiable factors.
By measuring these influences, regression allows us to explain the past, understand the
present, and even predict the future.
Once you grasp this concept, you begin to see patterns everywhere: in education,
business, health, and society.
In fact, the General Linear Regression Model quietly powers many decisions that shape our
daily lives.
IV. Dierenate between R² and Adjusted R².
Use the following data:
X: 65  57  57  54  66
Y: 26  13  16  −7  27
Esmate the regression line:

Esmate R² and Adjusted R², and interpret the results.
Also test the hypothesis that β = 0 against the alternave hypothesis β ≠ 0 at 5% level of
signicance.
Ans: Step 1: Understanding R² and Adjusted R²
R² (Coefficient of Determination): It measures how much of the variation in the
dependent variable (Y) is explained by the independent variable (X).
o R² ranges between 0 and 1.
o A higher R² means the regression line fits the data better.
o Example: R² = 0.80 means 80% of the variation in Y is explained by X.
Adjusted R²: Adjusted R² modifies R² to account for the number of predictors and
sample size. It prevents overestimation when more variables are added.
o Formula:
Adjusted R² = 1 − [(1 − R²)(n − 1) / (n − k − 1)]
where n = number of observations, k = number of predictors.
In simple regression (one predictor), adjusted R² is usually close to R², but slightly
lower.
Think of R² as the “raw score” of fit, while Adjusted R² is the “fair score” that penalizes
unnecessary complexity.
Step 2: The Data Provided
X: 65  57  57  54  66
Y: 26  13  16  −7  27
We want to estimate the regression line:
Y = α + βX
Step 3: Calculate Means of X and Y
X̄ = ΣX / n = (65 + 57 + 57 + 54 + 66) / 5 = 299 / 5 = 59.8
Ȳ = ΣY / n = (26 + 13 + 16 + (−7) + 27) / 5 = 75 / 5 = 15
Step 4: Estimate Slope (β)
Formula:
β = Σ(X − X̄)(Y − Ȳ) / Σ(X − X̄)²
Let’s compute step by step:
X     Y     (X − X̄)   (Y − Ȳ)   Product   Square
65    26    5.2        11         57.2      27.04
57    13    −2.8       −2         5.6       7.84
57    16    −2.8       1          −2.8      7.84
54    −7    −5.8       −22        127.6     33.64
66    27    6.2        12         74.4      38.44
Sum of products = 57.2 + 5.6 - 2.8 + 127.6 + 74.4 = 262
Sum of squares = 27.04 + 7.84 + 7.84 + 33.64 + 38.44 = 114.8
So,
β = 262 / 114.8 ≈ 2.28
Step 5: Estimate Intercept (α)
α = Ȳ − βX̄ = 15 − (2.2822 × 59.8) ≈ 15 − 136.48 ≈ −121.48
Step 6: Regression Equation
Ŷ = −121.48 + 2.28X
Step 7: Calculate R²
Formula:
R² = SSR / SST
Where:
SST = Total variation = Σ(Y − Ȳ)²
SSR = Regression variation = β × Σ(X − X̄)(Y − Ȳ)
After calculation:
SST = 121 + 4 + 1 + 484 + 144 = 754
SSR ≈ 2.2822 × 262 ≈ 597.9
So,
R² = 597.9 / 754 ≈ 0.793
This means about 79.3% of the variation in Y is explained by X.
Step 8: Adjusted R²
Here, n = 5, k = 1.
Adjusted R² = 1 − (1 − 0.793) × (5 − 1) / (5 − 1 − 1) = 1 − 0.207 × (4/3) ≈ 0.724
So, Adjusted R² ≈ 72.4%.
Step 9: Hypothesis Testing for β
We test H₀: β = 0 against H₁: β ≠ 0.
The t-statistic is:
t = β̂ / se(β̂), where se(β̂) = √[SSE / ((n − 2) × Σ(X − X̄)²)] and SSE = SST − SSR ≈ 156.1.
se(β̂) ≈ √(156.1 / (3 × 114.8)) ≈ 0.673, so t ≈ 2.2822 / 0.673 ≈ 3.39.
At the 5% level, with df = n − 2 = 3, the critical t ≈ 3.182. Our calculated t (≈ 3.39) is
greater.
So, we reject H₀. The slope is statistically significant.
Interpretation
The regression line is:
Ŷ = −121.48 + 2.28X
R² ≈ 0.793 → The model explains about 79.3% of the variation in Y.
Adjusted R² ≈ 0.724 → Even after adjusting for sample size, the model fits well.
The hypothesis test shows β ≠ 0 → X has a significant positive effect on Y.
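The whole calculation can be checked with a short stdlib-only Python sketch (variable names are ours):

```python
# Reproduce the Question IV calculations: slope, intercept, R²,
# Adjusted R², and the t-statistic for testing beta = 0.

x = [65, 57, 57, 54, 66]
y = [26, 13, 16, -7, 27]
n = len(x)
mx, my = sum(x) / n, sum(y) / n              # means: 59.8 and 15.0

sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))   # 262.0
sxx = sum((a - mx) ** 2 for a in x)                    # 114.8
beta = sxy / sxx                                       # ~ 2.282
alpha = my - beta * mx                                 # ~ -121.48

sst = sum((b - my) ** 2 for b in y)                    # 754.0
ssr = beta * sxy                                       # ~ 597.9
r2 = ssr / sst                                         # ~ 0.793
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - 2)              # ~ 0.724

sse = sst - ssr
se_beta = (sse / (n - 2) / sxx) ** 0.5                 # standard error of slope
t_stat = beta / se_beta                                # ~ 3.39 > 3.182 critical
print(round(beta, 3), round(r2, 3), round(adj_r2, 3), round(t_stat, 2))
```

The printed values match the hand calculation, confirming that the slope is significant at the 5% level with 3 degrees of freedom.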
Conclusion
In simple words:
R² tells us how well the regression fits the data.
Adjusted R² refines this measure to avoid overestimation.
In our case, both values are fairly high, meaning the model fits the data well.
The slope (β) is statistically significant, showing that X strongly influences Y.
This example shows how regression is not just math; it’s a powerful way to understand
relationships and make predictions.
SECTION – C
V. Discuss the problem of Heteroscedascity in regression analysis.
What are the tests to detect this problem and explain the remedial measures.
Ans: 1. What is Heteroscedasticity?
In regression analysis, one important assumption of the Ordinary Least Squares (OLS)
method is that the variance of the error term remains constant for all observations.
If the variance of errors is constant, we call it homoscedasticity (homo = same).
If the variance of errors changes from observation to observation, we call it
heteroscedasticity (hetero = different).
Simple definition:
Heteroscedasticity occurs when the spread of residuals (errors) is not constant across all
levels of the independent variable.
2. Understanding Through an Example
Suppose we study the relationship between income and consumption.
Low-income families usually spend close to their income → less variation in
spending.
High-income families differ widely → some save a lot, some spend a lot → more
variation.
So, as income increases, the variance of consumption also increases. This leads to
heteroscedasticity.
In simple words:
At low values of X → small errors
At high values of X → large errors
This violates the OLS assumption of equal variance.
3. Why is Heteroscedasticity a Problem?
A very important point to understand is this:
Heteroscedasticity does NOT make OLS estimates biased, but it makes them unreliable.
Let’s break this down.
(a) Unbiased but Inefficient Estimates
The regression coefficients (like slope and intercept) are still unbiased. However, they are
no longer efficient, meaning:
They do not have the minimum variance.
There are better estimators available than OLS.
(b) Wrong Standard Errors
This is the biggest danger.
Because standard errors are wrong:
t-tests become unreliable
F-tests become misleading
Confidence intervals may be too wide or too narrow
As a result, you might:
Think a variable is significant when it is not
Think a variable is insignificant when it is actually important
In exams, this point is very important.
4. How Can We Detect Heteroscedasticity?
Economists and statisticians use graphical methods and formal statistical tests.
A. Graphical Methods
(i) Residual Plot
Plot residuals on the Y-axis and the independent variable on the X-axis.
If residuals form a random cloud → no problem
If residuals form a fan shape, cone shape, or wave → heteroscedasticity exists
This is the easiest and most intuitive method.
B. Statistical Tests
Now let’s look at formal tests commonly used in regression analysis.
1. Breusch–Pagan Test
This test checks whether the variance of errors depends on one or more independent
variables.
Idea behind the test:
If error variance is related to explanatory variables, heteroscedasticity exists.
Procedure (simplified):
1. Run the original regression.
2. Obtain residuals.
3. Regress squared residuals on independent variables.
4. Use a chi-square test.
Result:
If the test statistic is significant → heteroscedasticity present.
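The four steps above can be sketched in stdlib-only Python for the single-regressor case. The data are simulated so that the error spread grows with X (heteroscedastic by construction); 3.84 is the 5% chi-square critical value for one restriction.

```python
# A minimal sketch of the Breusch-Pagan procedure for one regressor.
import random

random.seed(42)
x = [i / 10 for i in range(1, 101)]
y = [2 + 3 * xi + random.gauss(0, xi) for xi in x]  # error spread grows with x

def ols(xs, ys):
    """Return (intercept, slope) of a two-variable OLS fit."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = sum((a - mx) * (c - my) for a, c in zip(xs, ys)) / sum((a - mx) ** 2 for a in xs)
    return my - b * mx, b

# Steps 1-2: run the original regression and obtain residuals
a, b = ols(x, y)
resid = [yi - (a + b * xi) for xi, yi in zip(x, y)]

# Step 3: regress squared residuals on x and compute the auxiliary R²
u2 = [e ** 2 for e in resid]
a2, b2 = ols(x, u2)
mu = sum(u2) / len(u2)
ss_res = sum((u - (a2 + b2 * xi)) ** 2 for xi, u in zip(x, u2))
ss_tot = sum((u - mu) ** 2 for u in u2)
r2_aux = 1 - ss_res / ss_tot

# Step 4: LM statistic n*R² is compared against chi-square(1) at 5% (3.84)
lm = len(x) * r2_aux
print(lm > 3.84)  # heteroscedasticity is detected in this simulated data
```

In practice this test is available ready-made in statistical software; the sketch just makes the auxiliary-regression idea concrete.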
2. White Test
The White test is more general and powerful.
Key feature:
It does not assume any specific form of heteroscedasticity.
What it checks:
Variance of errors depends on independent variables, their squares, or cross-
products.
Advantage:
Works even when you don’t know the exact pattern of heteroscedasticity.
Disadvantage:
Uses many variables, so it may reduce degrees of freedom in small samples.
3. Goldfeld–Quandt Test
This test is useful when heteroscedasticity is suspected to increase with the level of an
independent variable.
Steps:
1. Arrange data in increasing order of X.
2. Omit middle observations.
3. Divide remaining data into two groups.
4. Run separate regressions.
5. Compare variances using an F-test.
Conclusion:
If variances differ significantly → heteroscedasticity exists.
5. What Are the Remedial Measures?
Once heteroscedasticity is detected, we must correct it. Several remedies are available.
1. Transforming the Data
One of the most common solutions.
(a) Log Transformation
Taking logarithms often stabilizes variance.
Example:
Instead of using Y, use log(Y)
Instead of X, use log(X)
This works well in income, consumption, and production data.
2. Weighted Least Squares (WLS)
When the form of heteroscedasticity is known, WLS is very effective.
Idea:
Give less weight to observations with higher variance.
Give more weight to observations with lower variance.
This restores efficiency and corrects standard errors.
3. Robust Standard Errors
This is the most practical and widely used solution today.
What it does:
Keeps OLS coefficients the same
Corrects standard errors to account for heteroscedasticity
Advantage:
No need to know the form of heteroscedasticity
Easy to implement in software
Limitation:
Improves inference, not efficiency
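For a two-variable regression, the contrast between the classical and the White (HC0) robust standard error can be sketched as follows. This stdlib-only example reuses the small X–Y data from Question II purely for illustration; the formulas are the textbook ones: classical Var(b) = s²/Sxx, and HC0 Var(b) = Σ(devᵢ² uᵢ²)/Sxx².

```python
# Compare the classical OLS slope standard error with the HC0 robust one.

x = [6, 8, 12, 14, 18, 20]
y = [8, 10, 6, 8, 10, 12]
n = len(x)
mx, my = sum(x) / n, sum(y) / n
sxx = sum((a - mx) ** 2 for a in x)
b = sum((a - mx) * (c - my) for a, c in zip(x, y)) / sxx
a0 = my - b * mx
u = [yi - (a0 + b * xi) for xi, yi in zip(x, y)]   # residuals

# Classical standard error: assumes one constant error variance s²
s2 = sum(e ** 2 for e in u) / (n - 2)
se_classical = (s2 / sxx) ** 0.5

# HC0 robust standard error: lets each observation keep its own u_i²
se_robust = (sum(((xi - mx) ** 2) * (e ** 2)
                 for xi, e in zip(x, u)) / sxx ** 2) ** 0.5

print(round(se_classical, 4), round(se_robust, 4))
```

The coefficients themselves are identical under both; only the standard error, and hence the t-test, changes, which is exactly the point made above.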
4. Improving Model Specification
Sometimes heteroscedasticity exists because:
Important variables are omitted
Functional form is wrong
Adding relevant variables or changing the model structure can reduce the problem
naturally.
6. Summary and Conclusion
To sum up in simple terms:
Heteroscedasticity means unequal spread of errors in regression analysis.
It violates an important assumption of OLS.
It does not bias coefficients, but it distorts hypothesis testing.
It can be detected using:
o Residual plots
o Breusch–Pagan test
o White test
o Goldfeld–Quandt test
It can be corrected using:
o Data transformation
o Weighted Least Squares
o Robust standard errors
o Better model specification
VI. What are the consequences and tests of the Multicollinearity problem in regression
analysis?
Ans: What is Heteroscedasticity?
In regression analysis, one of the key assumptions of the Ordinary Least Squares (OLS)
method is that the variance of the error terms (residuals) is constant across all values of the
independent variable(s). This property is called homoscedasticity.
If the variance of errors is constant → Homoscedasticity (ideal situation).
If the variance of errors changes (increases or decreases) with X →
Heteroscedasticity (problematic situation).
Simple Example
Imagine you are studying the relationship between income (X) and expenditure (Y). For
low-income groups, expenditure patterns are similar, so residuals are small. But for
high-income groups, expenditure varies widely: some spend a lot, some save more. This
means the error variance increases with income. That’s heteroscedasticity.
Graphically, if you plot residuals against X:
Homoscedasticity looks like a “cloud” spread evenly.
Heteroscedasticity looks like a “fan” or “cone,” where residuals spread out as X
increases.
Why is Heteroscedasticity a Problem?
1. Inefficient Estimates: OLS estimates of coefficients remain unbiased, but they are no
longer efficient (they don’t have minimum variance).
2. Invalid Inference: Standard errors are distorted, which affects t-tests and F-tests.
You might wrongly conclude that a variable is significant or insignificant.
3. Poor Predictions: Confidence intervals and forecasts become unreliable.
In short, heteroscedasticity doesn’t break the regression line itself, but it breaks the
reliability of statistical testing and inference.
Tests to Detect Heteroscedasticity
Several methods exist to check whether residuals suffer from heteroscedasticity:
1. Graphical Method
Plot residuals against fitted values or independent variables.
If the spread of residuals increases or decreases systematically, heteroscedasticity is
present.
2. Breusch–Pagan Test
A formal statistical test where residuals are regressed on independent variables.
If the test statistic is significant, heteroscedasticity exists.
3. White’s Test
A more general test that doesn’t assume a specific form of heteroscedasticity.
It checks whether squared residuals are related to independent variables and their
squares.
4. Goldfeld–Quandt Test
Splits the data into two groups and compares variances of residuals.
If variances differ significantly, heteroscedasticity is detected.
5. Park Test / Glejser Test
Residuals are regressed against functions of independent variables (like square root,
inverse).
Significant results indicate heteroscedasticity.
Remedial Measures
Once heteroscedasticity is detected, we need to fix it. Common remedies include:
1. Transforming Variables
Apply logarithmic or square root transformation to dependent variable.
Example: Instead of regressing Y on X, regress log(Y) on X. This often stabilizes
variance.
2. Weighted Least Squares (WLS)
Assign weights to observations inversely proportional to their variance.
This gives less importance to observations with high variance and more to those with
low variance.
3. Robust Standard Errors
Use heteroscedasticity-consistent standard errors (like White’s robust SE).
This doesn’t change coefficients but corrects standard errors, making hypothesis
tests valid.
4. Model Specification
Sometimes heteroscedasticity arises because the model is misspecified (missing
variables, wrong functional form).
Adding relevant variables or changing the model form can solve the issue.
Making It Relatable
Think of regression like measuring the height of plants based on the amount of water they
get. If all plants respond similarly, errors are small and uniform (homoscedasticity). But if
some plants grow wildly while others barely grow, the variation in errors increases with
water: this is heteroscedasticity.
It’s like trying to fit a straight line through data where the “noise” gets louder as you move
along. The line is still there, but the noise makes it harder to trust your conclusions.
Conclusion
Heteroscedasticity means unequal variance of residuals in regression.
It doesn’t bias coefficients but makes statistical tests unreliable.
Detection methods include graphical plots, Breusch–Pagan, White’s test, Goldfeld–
Quandt, and others.
Remedies include variable transformation, Weighted Least Squares, robust standard
errors, and correcting model specification.
SECTION – D
VII. Dierenate between Distributed Lag Models and Auto-Regressive Models.
Discuss the problems of esmaon of Koyck’s Distributed Lag Model.
Ans: Understanding the Basic Idea: Time Matters in Economics
Imagine you start going to the gym today. Will you become fit tomorrow? Probably not.
Your body responds gradually over time. Similarly, in economics, many variables do not
react instantly to changes. Instead, their effects are spread across multiple time periods.
For example:
If the government reduces interest rates, businesses may take time to invest.
If a company increases advertising, sales may rise slowly over several months.
If farmers use better fertilizers, crop yield improves over seasons, not overnight.
Economists use special models to capture these delayed effects. Two important ones are
Distributed Lag Models and Auto-Regressive Models.
What is a Distributed Lag Model?
A Distributed Lag Model (DLM) assumes that the effect of an independent variable is
distributed, or spread, over time.
Let’s take a simple example.
Suppose a company spends money on advertising. The impact of today’s advertisement may
influence sales today, next month, and even months later. Customers might remember the
brand and purchase later.
So instead of writing:
Sales today = Advertising today
Economists write something like:
Sales today = Advertising today + Advertising last month + Advertising two months ago
This means past values still matter.
Key Characteristics of Distributed Lag Models
1. Delayed Effect: Changes do not produce immediate results.
2. Multiple Time Periods: Both present and past values of the explanatory variable are
included.
3. Realistic Representation: Many economic behaviors follow this pattern.
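The advertising example above can be sketched as a finite distributed lag regression. This is a minimal numpy illustration with invented numbers: sales depend on current advertising plus advertising from the previous two months, and OLS recovers those delayed effects.

```python
# Minimal sketch of a finite Distributed Lag Model:
#   sales_t = a + b0*adv_t + b1*adv_{t-1} + b2*adv_{t-2} + error.
# All numbers are invented for illustration.
import numpy as np

rng = np.random.default_rng(1)
T = 300
adv = rng.uniform(0, 10, T)
noise = rng.normal(0, 1, T)

# True delayed effects: strongest today, fading over two months.
sales = np.empty(T)
sales[:2] = 0.0
for t in range(2, T):
    sales[t] = 5 + 2.0*adv[t] + 1.0*adv[t-1] + 0.5*adv[t-2] + noise[t]

# Regressor matrix holds current AND lagged advertising.
X = np.column_stack([np.ones(T-2), adv[2:], adv[1:-1], adv[:-2]])
beta, *_ = np.linalg.lstsq(X, sales[2:], rcond=None)
print("estimated lag effects:", beta[1:])
```

The three estimated lag coefficients land close to the true values 2.0, 1.0, and 0.5, showing how the impact of advertising is "distributed" across months.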
Example in Real Life
Think about education. The knowledge you gain in school affects your career years later.
The effect is not instant; it is distributed over time.
What is an Auto-Regressive Model?
Now let’s move to the second concept.
An Auto-Regressive Model (AR Model) is based on the idea that the past values of a
variable influence its current value.
Here, instead of focusing on past independent variables, we focus on the past values of the
dependent variable itself.
For example:
Consumption today depends on consumption yesterday.
Why? Because people develop habits. If a family spent ₹40,000 last month, they are unlikely
to suddenly spend ₹10,000 this month unless something major changes.
Key Characteristics of Auto-Regressive Models
1. Dependence on the Past: The variable explains itself using its past values.
2. Captures Momentum: Economic variables often show continuity.
3. Useful in Forecasting: Widely used to predict inflation, GDP, stock prices, etc.
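The habit idea can be written as an AR(1) model, where today's value is a constant plus a fraction of yesterday's value. This is a minimal numpy sketch with an invented persistence parameter of 0.8; OLS of the variable on its own lagged value recovers that persistence and gives a one-step-ahead forecast.

```python
# Minimal sketch of an AR(1) model: y_t = c + phi*y_{t-1} + error.
# The persistence parameter phi = 0.8 is invented for illustration.
import numpy as np

rng = np.random.default_rng(2)
T = 500
phi_true = 0.8                        # "habit" / persistence parameter
y = np.empty(T)
y[0] = 0.0
for t in range(1, T):
    y[t] = 1.0 + phi_true * y[t-1] + rng.normal(0, 0.5)

# Regress y_t on its own past value y_{t-1} by OLS.
X = np.column_stack([np.ones(T-1), y[:-1]])
c_hat, phi_hat = np.linalg.lstsq(X, y[1:], rcond=None)[0]

# One-step-ahead forecast: tomorrow looks like a damped version of today.
forecast = c_hat + phi_hat * y[-1]
print("estimated phi:", phi_hat)
```

Notice the model uses no outside variable at all: the series "explains itself," which is exactly the internal persistence described above.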
Real-Life Analogy
Think about your daily routine. If you wake up at 6 AM today, chances are you will wake up
around the same time tomorrow. Your past behavior shapes your present behavior.
Difference Between Distributed Lag Models and Auto-Regressive Models
Now that we understand both concepts, let’s clearly differentiate them.
1. Basic Idea
Distributed Lag Model: Current value depends on current and past values of another
variable.
Auto-Regressive Model: Current value depends on its own past values.
2. Focus
DLM: Focuses on delayed effects of external factors.
AR Model: Focuses on internal persistence over time.
3. Example
DLM: Crop yield depends on fertilizer used this year and previous years.
AR Model: Crop yield depends on last year’s yield.
4. Purpose
DLM: Helps understand how long a policy or decision takes to show results.
AR Model: Helps identify patterns and predict future values.
5. Complexity
Distributed lag models can become complicated when many past periods are included.
Auto-regressive models are often simpler but still powerful.
In short, DLM explains delayed impact, while AR explains continuity.
Koyck’s Distributed Lag Model: A Smart Shortcut
Including many lagged variables in a regression can create problems:
Too many variables
Loss of degrees of freedom
Complex calculations
To solve this, economist Koyck proposed a transformation method.
Instead of including infinite past values, Koyck assumed that the influence of past variables
declines gradually, like a fading memory.
For example:
Last month’s advertising still has a strong effect.
Advertising from six months ago has a weaker effect.
This pattern is called geometric decay.
By applying a mathematical transformation, Koyck converted the distributed lag model into
a simpler form that includes a lagged dependent variable.
This made estimation easier, but not perfect.
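The transformation can be seen in a small numerical sketch. Under geometric decay with rate L, the infinite-lag model y_t = a + b(x_t + L·x_{t-1} + L²·x_{t-2} + …) + u_t collapses into y_t = a(1−L) + b·x_t + L·y_{t-1} + (u_t − L·u_{t-1}), so only x_t and the lagged dependent variable y_{t-1} need to be included. The values of a, b, and L below are invented for illustration.

```python
# Minimal sketch of the Koyck transformation. The infinite geometric-lag
# model is equivalent to: y_t = a*(1-L) + b*x_t + L*y_{t-1} + (u_t - L*u_{t-1}).
# Parameter values are invented for illustration.
import numpy as np

rng = np.random.default_rng(3)
T, a, b, L = 600, 2.0, 1.5, 0.6        # L = geometric decay rate
x = rng.uniform(0, 5, T)
u = rng.normal(0, 0.3, T)

# Simulate y recursively using the transformed (equivalent) form.
y = np.empty(T)
y[0] = a + b * x[0] + u[0]
for t in range(1, T):
    y[t] = a * (1 - L) + b * x[t] + L * y[t-1] + u[t] - L * u[t-1]

# Estimate by OLS with the lagged dependent variable as a regressor.
X = np.column_stack([np.ones(T-1), x[1:], y[:-1]])
const, b_hat, L_hat = np.linalg.lstsq(X, y[1:], rcond=None)[0]
print("b ~", b_hat, "  L ~", L_hat)
# Caution: the transformed error (u_t - L*u_{t-1}) is autocorrelated and
# correlated with the regressor y_{t-1}, so these OLS estimates are biased.
# That is precisely the estimation problem discussed next.
```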
Problems in Estimating Koyck’s Distributed Lag Model
Even though Koyck’s approach is clever, it introduces several challenges.
1. Autocorrelation Problem
One of the biggest issues is autocorrelation, which means error terms become correlated
over time.
Why does this happen?
Because the lagged dependent variable is included as an explanatory variable. This violates
one of the key assumptions of classical regression: that error terms should be independent.
Result: Estimates may become inefficient and misleading.
2. Bias in Small Samples
In small datasets, the estimated coefficients can be biased.
This means the results may systematically deviate from the true values.
For students, think of it like surveying only five people to understand the opinion of an
entire city; the conclusion might not be reliable.
3. Strong Assumption of Geometric Decay
Koyck assumes that the lag weights decline geometrically.
But real life does not always behave so neatly.
Sometimes effects may:
Increase first, then decline
Stay constant for a while
Drop suddenly
If the assumption is wrong, the model becomes misspecified, leading to incorrect
conclusions.
4. Difficulty in Interpretation
Because the model is transformed, interpreting coefficients is not always straightforward.
Students often struggle to understand the long-run versus short-run effects.
5. Measurement Errors
If past data is inaccurate, the lagged dependent variable carries that error forward, affecting
the entire estimation.
It’s like building a house on a weak foundation.
Conclusion
To summarize, both Distributed Lag Models and Auto-Regressive Models help economists
understand how time influences economic behavior, but they do so in different ways.
Distributed Lag Models capture delayed reactions to external factors.
Auto-Regressive Models capture the persistence of past behavior.
Koyck’s Distributed Lag Model provides an elegant solution to handling multiple lagged
variables by assuming geometric decay. However, it brings challenges such as
autocorrelation, bias, restrictive assumptions, and interpretation difficulties.
The key takeaway is this: economics is not just about what happens now; it is deeply
connected to what happened before. Whether it is consumer habits, government policies,
or business investments, the past always shapes the present.
VIII. What are dummy variables?
Explain the uses of dummy variables.
Ans: What Are Dummy Variables?
A dummy variable is a numerical variable used in regression analysis to represent categories
or qualitative attributes. It usually takes the value:
1 if a condition is true (presence of a category),
0 if the condition is false (absence of a category).
In simple words, dummy variables act like “switches” that turn certain effects on or off in a
regression model.
Example
Suppose you want to study the effect of gender on wages. Gender is not a number—it’s a
category (male/female). To include it in regression, you create a dummy variable:
Male = 1, Female = 0.
Now the regression can measure how wages differ between men and women.
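This gender-dummy example can be sketched numerically. Below is a minimal numpy illustration with made-up wage data: the dummy "male" switches a wage premium on or off, and its estimated coefficient is the male–female wage difference, holding experience constant.

```python
# Minimal sketch of a dummy-variable regression:
#   wage = b0 + b1*male + b2*experience + error   (made-up data).
import numpy as np

rng = np.random.default_rng(4)
n = 400
male = rng.integers(0, 2, n)           # dummy: 1 = male, 0 = female
exper = rng.uniform(0, 20, n)
# True model: base wage 200, +50 if male, +10 per year of experience.
wage = 200 + 50*male + 10*exper + rng.normal(0, 15, n)

X = np.column_stack([np.ones(n), male, exper])
beta, *_ = np.linalg.lstsq(X, wage, rcond=None)
print("estimated gender gap:", beta[1])
```

The coefficient on the dummy comes out near 50, the wage difference between the two groups that we built into the simulated data.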
Why Do We Need Dummy Variables?
Regression models require numerical inputs. But real-world data often includes qualitative
factors like gender, region, religion, or occupation. Dummy variables allow us to:
Convert categories into numbers.
Capture differences between groups.
Add flexibility to regression models.
Without dummy variables, regression would ignore important qualitative influences.
Uses of Dummy Variables
1. Representing Categories
Dummy variables represent qualitative categories in regression.
Example: Urban vs. Rural (Urban = 1, Rural = 0).
This helps measure how living in a city affects income compared to living in a village.
2. Comparing Groups
They allow comparison between groups.
Example: In education studies, a dummy variable can represent whether a student
attended private school (1) or government school (0).
The coefficient shows the difference in performance between the two groups.
3. Capturing Seasonal Effects
In time series data, dummy variables can represent seasons or months.
Example: For quarterly sales data, you can create dummies for Q1, Q2, Q3, Q4.
This helps capture seasonal variations in sales.
4. Policy Impact Analysis
Dummy variables can represent whether a policy was implemented.
Example: Before policy = 0, After policy = 1.
The coefficient shows the impact of the policy on outcomes.
5. Interaction Effects
Dummy variables can interact with numerical variables to capture different slopes for
different groups.
Example: Effect of education on wages may differ for men and women.
By interacting gender dummy with years of education, you can measure this
difference.
6. Structural Breaks in Data
In economic studies, dummy variables can represent structural breaks (major changes in
economy).
Example: Dummy = 1 after liberalization year, 0 before.
This shows how economic reforms changed growth patterns.
Important Points to Remember
If a variable has k categories, you need k-1 dummy variables to avoid the “dummy
variable trap” (perfect multicollinearity).
The category left out becomes the reference group.
Coefficients of dummy variables show differences compared to the reference group.
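The k−1 rule and the reference-group interpretation can be verified in a small sketch. Here three made-up regions (North, South, East) have k = 3 categories, so only two dummies enter the regression; "North" is the omitted reference group, and each dummy coefficient estimates that region's difference from North.

```python
# Minimal sketch of the "k-1 dummies" rule with three regions.
# "North" is left out as the reference group. Data are invented.
import numpy as np

rng = np.random.default_rng(5)
n = 300
region = rng.choice(["North", "South", "East"], n)
# Made-up group means: North = 100, South = 120, East = 90.
means = {"North": 100.0, "South": 120.0, "East": 90.0}
y = np.array([means[r] for r in region]) + rng.normal(0, 5, n)

# k = 3 categories -> only k-1 = 2 dummies. Adding a third "North"
# dummy would duplicate the intercept column: the dummy variable trap.
d_south = (region == "South").astype(float)
d_east = (region == "East").astype(float)
X = np.column_stack([np.ones(n), d_south, d_east])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
# beta[0] ~ mean of the reference group (North);
# beta[1] ~ South minus North; beta[2] ~ East minus North.
print(beta)
```

As expected, the intercept lands near 100 (the North mean), while the dummy coefficients land near +20 and −10, the differences from the reference group.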
Making It Relatable
Think of dummy variables like light switches in a room. Each switch controls whether a
particular lamp (category effect) is on (1) or off (0). The regression equation is like the total
brightness in the room; it depends on which switches are turned on.
Conclusion
Dummy variables are numerical representations of qualitative attributes.
They are essential for including categories like gender, region, season, or policy in
regression models.
Uses include representing categories, comparing groups, capturing seasonal effects,
analyzing policy impacts, modeling interactions, and detecting structural breaks.
They make regression models more realistic and powerful by allowing qualitative
influences to be measured quantitatively.
This paper has been carefully prepared for educational purposes. If you notice any
mistakes or have suggestions, feel free to share your feedback.